Statistical Linearization for Value Function Approximation in Reinforcement Learning
Author
Abstract
Reinforcement learning (RL) is a machine learning answer to the optimal control problem. It consists of learning an optimal control policy through interactions with the system to be controlled, the quality of this policy being quantified by the so-called value function. An important RL subtopic is approximating this function when the system is too large for an exact representation. This paper presents statistical-linearization-based approaches for estimating such functions. Compared to more classical approaches, this allows considering nonlinear parameterizations as well as the Bellman optimality operator, which induces some differentiability problems. Moreover, the statistical point of view adopted here allows considering colored observation-noise models instead of the classical white one; in RL, this can prove useful.
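The statistical linearization the abstract refers to can be illustrated with a sigma-point (unscented-transform) construction: a nonlinear function is replaced by its best linear fit under a Gaussian distribution over its input. The sketch below is only an illustration of this general technique under assumed weights and function names, not the paper's actual algorithm.

```python
import numpy as np

def sigma_points(mean, cov, kappa=1.0):
    """Generate the 2n+1 standard sigma points for N(mean, cov)."""
    n = mean.size
    S = np.linalg.cholesky((n + kappa) * cov)
    pts = [mean]
    for i in range(n):
        pts.append(mean + S[:, i])
        pts.append(mean - S[:, i])
    w = np.full(2 * n + 1, 1.0 / (2 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    return np.array(pts), w

def statistical_linearization(f, mean, cov):
    """Best linear fit f(x) ~ A x + b with respect to N(mean, cov):
    A = P_yx P_xx^{-1}, b = E[f(x)] - A mean, moments estimated
    from the sigma points."""
    pts, w = sigma_points(mean, cov)
    ys = np.array([f(p) for p in pts])
    y_mean = w @ ys
    # Cross-covariance between input and output of f
    P_xy = ((pts - mean) * w[:, None]).T @ (ys - y_mean)
    A = np.linalg.solve(cov, P_xy).T
    b = y_mean - A @ mean
    return A, b
```

For a function that is already linear, this construction recovers the function exactly, which is a convenient sanity check; for a nonlinear value-function parameterization it yields the locally best linear surrogate, sidestepping the differentiability issues the abstract mentions.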
Similar resources
Function Approximation in Hierarchical Relational Reinforcement Learning
Recently, a number of different approaches have been developed for hierarchical reinforcement learning in the propositional setting. We propose a hierarchical version of relational reinforcement learning (HRRL). We describe a value function approximation method inspired by logic programming which is suitable for HRRL.
Active Policy Iteration: Efficient Exploration through Active Learning for Value Function Approximation in Reinforcement Learning
Appropriately designing sampling policies is highly important for obtaining better control policies in reinforcement learning. In this paper, we first show that the least-squares policy iteration (LSPI) framework allows us to employ statistical active learning methods for linear regression. Then we propose a design method of good sampling policies for efficient exploration, which is particularl...
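The policy-evaluation step inside the LSPI framework mentioned above reduces to solving a linear system, which is what makes linear-regression-style active learning applicable. The following LSTD(0) sketch shows that step under assumed names and a small regularizer; it is a generic illustration, not the cited paper's implementation.

```python
import numpy as np

def lstd(transitions, phi, gamma=0.95, reg=1e-6):
    """Least-squares temporal-difference policy evaluation:
    solve A w = b with A = sum_i phi(s_i) (phi(s_i) - gamma phi(s'_i))^T
    and b = sum_i r_i phi(s_i)."""
    d = phi(transitions[0][0]).size
    A = reg * np.eye(d)   # small ridge term for invertibility
    b = np.zeros(d)
    for s, r, s_next in transitions:
        f = phi(s)
        A += np.outer(f, f - gamma * phi(s_next))
        b += r * f
    return np.linalg.solve(A, b)
```

With one-hot features and one sample per transition of a deterministic chain, the solution coincides with the exact fixed point of the Bellman evaluation equations, up to the tiny regularization.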
Bayesian Reinforcement Learning with Gaussian Process Temporal Difference Methods
Reinforcement Learning is a class of problems frequently encountered by both biological and artificial agents. An important algorithmic component of many Reinforcement Learning solution methods is the estimation of state or state-action values of a fixed policy controlling a Markov decision process (MDP), a task known as policy evaluation. We present a novel Bayesian approach to policy evaluati...
A Brief Survey of Parametric Value Function Approximation
Reinforcement learning is a machine learning answer to the optimal control problem. It consists of learning an optimal control policy through interactions with the system to be controlled, the quality of this policy being quantified by the so-called value function. An important subtopic of reinforcement learning is to compute an approximation of this value function when the system is too large ...
Model-based reinforcement learning using on-line clustering
A significant issue in representing reinforcement learning agents in Markov decision processes is how to design efficient feature spaces in order to estimate the optimal policy. This study addresses the challenge by proposing a compact framework that employs an on-line clustering approach for building appropriate basis functions. It also performs a state-action trajectory analysis to gai...
Journal:
Volume, Issue
Pages: -
Publication year: 2010